Speech Emotion Recognition Based on Deep Belief Networks and Wavelet Packet Cepstral Coefficients

نویسندگان

YONGMING HUANG

Yongming Huang

Ao Wu

Guobao Zhang

Yue Li

چکیده

A wavelet packet based adaptive filter-bank construction combined with Deep Belief Network(DBN) feature learning method is proposed for speech signal processing in this paper. On this basis, a set of acoustic features are extracted for speech emotion recognition, namely Coiflet Wavelet Packet Cepstral Coefficients (CWPCC). CWPCC extends the conventional MelFrequency Cepstral Coefficients (MFCC) by adapting the filter-bank structure according to the decision task. And Deep Belief Networks (DBNs) are artificial neural networks having more than one hidden layer, which are first pre-trained layer by layer and then fine-tuned using back propagation algorithm. The well-trained deep neural networks are capable of modeling complex and non-linear features of input training data and can better predict the probability distribution over classification labels. Speech emotion recognition system is constructed with the feature set, DBNs feature learning structure and Support Vector Machine as classifier. Experimental results on Berlin emotional speech database show that the Coiflet Wavelet Packet is more suitable in speech emotion recognition than other acoustics features and proposed DBNs feature learning structure combined with CWPCC improve emotion recognition performance over the conventional emotion recognition method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

متن کامل

Speech Recognition using Wavelet Packet Features

In view of the growing use of automatic speech recognition in the modern society, we study various alternative representations of the speech signal that have the potential to contribute to the improvement of the recognition performance. Specifically, the main targets of the present article are to overview and evaluate the practical importance of some recently proposed, and thus less studied, wa...

متن کامل

Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...

متن کامل

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2016

Speech Emotion Recognition Based on Deep Belief Networks and Wavelet Packet Cepstral Coefficients

نویسندگان

چکیده

منابع مشابه

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Recognition using Wavelet Packet Features

Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

عنوان ژورنال:

اشتراک گذاری